AN547/07/92

Cascading IMS A110s
Application Note

Contents

1. Introduction
2. Operation of a single IMS A110
   2.1 One dimensional operation of an IMS A110
   2.2 Two dimensional operation of an IMS A110
3. Fundamentals of cascading IMS A110s
4. Cascading IMS A110s to produce long one dimensional filters
5. Cascading IMS A110s to produce wider two dimensional filters
6. Cascading IMS A110s to produce higher two dimensional filters
7. Cascading IMS A110s to produce wider and higher two dimensional filters
8. Cascading IMS A110s to perform multi pass filtering operations
9. Cascading IMS A110s for increased data precision
   9.1 Increasing data precision with an external 22 bit adder
   9.2 Increasing data precision with an external delay line
   9.3 Increasing data precision with no external hardware
10. Cascading IMS A110s for increased coefficient precision
   10.1 Increasing coefficient precision with an external 22 bit adder
   10.2 Increasing coefficient precision with an external delay line
   10.3 Increasing coefficient precision with no external hardware
11. Summary

1. Introduction

The IMS A110 is a single-chip programmable and cascadable device suitable for many high speed image and signal processing applications. It consists of a configurable array of multiply-accumulators (420 MOPS), three programmable length 1120 stage shift registers, a versatile post-processing unit and a microprocessor interface for configuration and control purposes. The comprehensive on-chip facilities make a single device capable of dealing with many image processing operations. A simplified block diagram is shown in Figure 1.

For some applications, however, the power and versatility of a single IMS A110 is not sufficient; in these cases a cascade of devices often provides a solution. The purpose of this document is to describe some of the most useful ways to cascade IMS A110s to achieve even higher performance. As such, it does not cover the use of the backend processor or device applications.

2. Operation of a single IMS A110

The A110 may be set up as either a one or two dimensional multiplier accumulator array (MAC).

2.1 One dimensional operation of an IMS A110

For one dimensional operation the first delay (PSRC) is set to some arbitrary value (normally zero) while PSRB and PSRA are set to zero.

N.B. At any given point in time the first MAC stage in bank C is processing the oldest data while the last MAC stage of bank A is processing the newest data.

[Figure 1: Block diagram of the IMS A110]

2.2 Two dimensional operation of an IMS A110

For two dimensional operation the first delay (PSRC) is again set to some arbitrary value; however, the setting of PSRA and PSRB is dependent on the line length in pixels of the image being processed. To achieve a rectangular convolution window, the number of delays to be programmed into PSRA and PSRB is equal to the line length in pixels plus the length of the MAC pipelines (seven stages). For example, if the screen width of the image to be processed is 512 pixels then the delay to be programmed into shift registers PSRA and PSRB is 519.

N.B. Normally when processing an image with an arbitrary setting of PSRC the delay (latency) through the IMS A110 causes the output image to be incorrectly aligned or skewed. This results in an apparent rotation of the output image in the horizontal plane. To correct this problem PSRC may be adjusted to introduce a suitable number of delays to shift the image into the correct position.

Typically image data is fed into an IMS A110 line by line, starting at the top left and ending at the bottom right. Given this definition, the first MAC stage in each row is processing the data nearest the left hand side of the screen (the oldest data) and the last MAC stage in each row is processing the data nearest the right hand side of the screen (the newest data). In a similar fashion the first row is always processing the newest data (the data nearest the bottom of the screen) and the last row is always processing the oldest data (the data nearest the top of the screen).
It is important to bear these relationships in mind when programming IMS A110s, otherwise the operation being performed on an image may not be what was expected.
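
The PSRA and PSRB rule from Section 2.2 can be written as a small helper. This is only a sketch: the function and constant names are illustrative and are not part of any IMS A110 software, and the range check simply assumes the 1120 stage shift register length quoted in the introduction.

```python
MAC_PIPELINE_STAGES = 7   # length of each MAC pipeline (Section 2.2)
PSR_MAX_LENGTH = 1120     # maximum programmable shift register length (Section 1)


def line_delay(line_length: int) -> int:
    """Delay to program into PSRA and PSRB for two dimensional operation."""
    delay = line_length + MAC_PIPELINE_STAGES
    if not 0 <= delay <= PSR_MAX_LENGTH:
        raise ValueError(f"delay {delay} does not fit in a {PSR_MAX_LENGTH} stage register")
    return delay


print(line_delay(512))  # 519, the example given in Section 2.2
```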

3. Fundamentals of cascading IMS A110s

Consider a single IMS A110 configured to perform some task on a stream of data values. The filter kernel formed by the coefficients may be thought of as a block passing over the data. To produce bigger filters it is necessary to join a number of separate blocks together. This may be achieved by connecting together a number of IMS A110s, as shown in Figure 2, and configuring them suitably.

In order to create a contiguous filter kernel (i.e. a filter without overlap or gaps) it is essential that the route between PSRin and PSRout for each device is programmed correctly and that the internal delay lines are programmed to the correct lengths.

To assist in calculating the delays to be programmed into the programmable shift registers it is convenient to define a reference data path through the MAC of any given IMS A110. In this document, unless specified otherwise, the reference path is taken to be from the input to the multiplier marked with an asterisk (*) in Figure 1 to the cascade adder marked with a hash (#). In addition, before embarking on any calculation it is necessary to know the following:

1) The delay between PSRin and PSRout when the data is routed directly from PSRin to PSRout without passing through the programmable shift registers. This delay is known as $D_d$.
2) The delay along the reference path. This delay is known as $D_r$.
3) The delay through the backend between cascade in and cascade out. This delay is known as $D_b$.
4) The locations of the other inherent delays within IMS A110s.
5) The meaning of line length, kernel width and kernel height. See Figure 3 for a definition of these terms.

Figure 1 shows a functional block diagram of an IMS A110 with all the inherent delays included. From this diagram it is possible to calculate the value of the three delay constants, as shown in Table 1.

Table 1

$D_d = 1 + 1 = 2$
$D_r = (1 + 1) + (7 + 1) + (7 + 13) = 30$
$D_b = 6$

[Figure 2: Standard connection for cascading IMS A110s]

[Figure 3: Depiction of line length, kernel width and kernel height (L = line length, W = kernel width, H = kernel height)]
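
As a quick arithmetic check, the three constants of Table 1 can be summed from the individual delays. The variable names below are illustrative only; the groupings are copied from Table 1 as printed.

```python
# Inherent delays, grouped as in Table 1.
D_D = 1 + 1                          # PSRin to PSRout, direct route
D_R = (1 + 1) + (7 + 1) + (7 + 13)   # reference path through the MAC
D_B = 6                              # backend, cascade in to cascade out

print(D_D, D_R, D_B)  # 2 30 6
```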

4. Cascading IMS A110s to produce long one dimensional filters

A single IMS A110 is capable of producing a one dimensional filter with up to 21 taps (shorter filters may be made by setting unrequired coefficients to zero). To create longer filters it is necessary to cascade a number of IMS A110s together. Each additional device added to the cascade gives an additional 21 taps, allowing filters of almost unlimited size to be built from simple building blocks.

To develop the delays required in a one dimensional cascade the system shown in Figure 4 will be considered. This system contains only two devices but will be examined in a general way so that rules may be developed for cascades of arbitrary length. Section 2.1 described how to set up the delays to achieve one dimensional convolution in a single device. Fortunately, in cascades of IMS A110s the data relationships within each device are the same as those which would exist inside a single non-cascaded device processing the same data. Hence, in the one dimensional cascade under consideration the delays programmed into PSRA and PSRB of each device are zero.

In order to cascade IMS A110s into long one dimensional filters the data is normally routed directly from the input to the output of each device without passing through the programmable shift registers, as shown in Figure 4. Each piece of data takes two routes through the cascade. One route generates partial results via the MAC of device n-1 and the other via the MAC of device n. These partial results are eventually combined at the cascade adder in the backend of device n. To produce the correct result it is important that these two separate data streams are aligned correctly.

Assuming that the delay in the PSRC of device n-1 is $x_{n-1}$ and that the delay in PSRC of device n is $x_n$, it is desired to calculate the relationship between these delays for correct combination of the partial results.
Consider an item of data when it reaches device n-1. The delay before the component due to this data, flowing via the reference path in device n-1, reaches the cascade adder of device n is:

$$D_{n-1} = 1 + x_{n-1} + 1 + 3 + D_r + D_b$$
$$D_{n-1} = 41 + x_{n-1}$$

Similarly the delay before the component due to this data, flowing via the reference path in device n, reaches the cascade adder of device n is:

$$D_n = D_d + 1 + x_n + 1 + 3 + D_r$$
$$D_n = 37 + x_n$$

Now, for a contiguous convolution kernel, the results flowing via the MAC of device n-1 must arrive at the cascade adder of device n 21 clock cycles behind those which have come from the other route. Hence:

$$D_{n-1} - D_n = 21$$
$$41 + x_{n-1} - 37 - x_n = 21$$
$$x_{n-1} = x_n + 17$$

This means that the PSRC of device n-1 must be programmed with the value which is in PSRC of device n plus a fixed constant of 17. This rule may be extended to take into account any number of devices provided that the maximum length of the delay lines is not exceeded. The PSRC of the last device in the cascade may be programmed to an arbitrary value (normally zero), provided that the resulting PSRC setting of the first device in the cascade does not exceed the maximum delay line length.

[Figure 4: Direct data path connection for cascading IMS A110s]
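
The rule above can be sketched in a few lines of code. The function names are hypothetical and device 1 is taken to be the first device in the cascade; the two delay functions simply restate the equations for $D_{n-1}$ and $D_n$ so that the fixed offset of 17 is derived, not hard-coded.

```python
D_D, D_R, D_B = 2, 30, 6   # delay constants from Table 1
TAPS_PER_DEVICE = 21       # one dimensional taps provided by each device
PSR_MAX_LENGTH = 1120      # maximum programmable shift register length


def delay_via_previous(x_prev: int) -> int:
    """Delay to the cascade adder of device n via the MAC of device n-1."""
    return 1 + x_prev + 1 + 3 + D_R + D_B


def delay_via_current(x_cur: int) -> int:
    """Delay to the cascade adder of device n via the MAC of device n."""
    return D_D + 1 + x_cur + 1 + 3 + D_R


def psrc_settings(num_devices: int, last_psrc: int = 0) -> list[int]:
    """PSRC values for devices 1..num_devices of a one dimensional cascade."""
    # Offset between neighbours: required skew minus the skew already built in.
    step = TAPS_PER_DEVICE - (delay_via_previous(0) - delay_via_current(0))
    settings = [last_psrc + step * (num_devices - 1 - i) for i in range(num_devices)]
    if max(settings) > PSR_MAX_LENGTH:
        raise ValueError("PSRC of the first device exceeds the delay line length")
    return settings


print(psrc_settings(3))  # [34, 17, 0], the PSRC row of Table 2
```

With these settings, `delay_via_previous(34) - delay_via_current(17)` evaluates to 21, which is the alignment condition stated above.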

For example, consider the problem of filtering a data stream with a 50 tap filter. This could be achieved by cascading three IMS A110s. Typical delays which would have to be programmed into the devices are given in Table 2.

Table 2

        Device 1   Device 2   Device 3
PSRA    0          0          0
PSRB    0          0          0
PSRC    34         17         0

5. Cascading IMS A110s to produce wider two dimensional filters

A single IMS A110 is capable of filtering an image with a two dimensional kernel which has a maximum width of seven cells (narrower filters may be made by setting unrequired coefficients to zero). To create wider filters it is necessary to cascade a number of IMS A110s together. Each additional device added to the cascade increases the maximum width by an additional 7 cells, allowing filters of almost unlimited width to be created.

The connections required to cascade IMS A110s into horizontal cascades may be seen in Figure 4. The connections for this type of cascade are identical to those presented in Section 4 for one dimensional cascading. The difference in function is achieved by changing the delays present in the programmable shift registers.

It was mentioned in Section 2 that for two dimensional filtering using a single device PSRA and PSRB have to be programmed to the line length plus seven. Hence, to ensure correct alignment of the rows of the filter in a horizontal cascade, PSRA and PSRB of each of the devices must also be set to this value.

In order to cascade horizontally the pixel data is normally routed directly from the input to the output of each device without passing through the programmable shift registers. As before, each item of data (pixel) takes two routes through the cascade. Assuming that the delay in the PSRC of device n-1 is $x_{n-1}$ and that the delay in PSRC of device n is $x_n$, the route delay equations are the same as those calculated in Section 4.

$$D_{n-1} = 41 + x_{n-1}$$
$$D_n = 37 + x_n$$

Now, for a contiguous convolution kernel, the results flowing via the MAC of device n-1 must arrive at the cascade adder of device n 7 clock cycles behind those which have come from the other route. This may be achieved by ensuring that the data passing via MAC n-1 takes 7 cycles longer than data passing via the MAC n route. Hence:

$$D_{n-1} - D_n = 7$$
$$41 + x_{n-1} - 37 - x_n = 7$$
$$x_{n-1} = x_n + 3$$

This means that the PSRC of device n-1 must be programmed with the value which is in PSRC of device n plus a fixed constant of 3. This rule may be extended to cascade any number of devices provided that the maximum length of the delay lines is not exceeded. The value programmed into the PSRC of the last device in the cascade is arbitrary (normally adjusted to deskew the output image) but must not be set so high that the PSRC of the first device in the cascade exceeds its maximum.

For example, consider the problem of filtering a 1024 pixel wide image with a 15 × 3 filter kernel. This could be achieved by cascading three IMS A110s into a horizontal cascade. Typical delays which would have to be programmed into the devices are given in Table 3.

Table 3

        Device 1   Device 2   Device 3
PSRA    1031       1031       1031
PSRB    1031       1031       1031
PSRC    6          3          0

6. Cascading IMS A110s to produce higher two dimensional filters

The maximum height of a two dimensional filter kernel produced by a single IMS A110 is three cells. This is restrictive in some applications, but may be easily overcome by cascading a number of IMS A110s into a single vertical strip. The theoretical maximum height of filter which can be created is equal to three times the number of devices cascaded. Hence the vertical filter size is limited only by the number of devices used.

To develop the delays required in a vertical cascade the system shown in Figure 5 will be considered.
This system contains only two devices but will be examined in a general way so that rules may be developed for cascades of arbitrary length. It was mentioned in Section 2 that for two dimensional filtering using a single device PSRA and PSRB have to be programmed to the line length plus seven (L + 7). To ensure correct alignment of the rows of the filter in a vertical cascade, PSRA and PSRB of each of the devices must also be set to this value.

To cascade vertically the pixel data is normally routed from the input to the output of each device via the programmable shift registers (see Figure 5). Again each pixel takes two routes through the cascade. One route generates partial results via the MAC of device n-1 and the other via the MAC of device n. These partial results are eventually combined at the cascade adder in the backend of device n. In order to produce the correct result it is important that these two data streams are aligned correctly.

Assuming that the delay in the PSRC of device n-1 is $x_{n-1}$ and that the delay in PSRC of device n is $x_n$, it is desired to calculate the relationship between these delays for correct combination of the partial results.

Consider a pixel when it reaches device n-1. The delay before the component due to this pixel, flowing via the reference path in device n-1, reaches the cascade adder of device n is:

$$D_{n-1} = 1 + x_{n-1} + 1 + 3 + D_r + D_b$$
$$D_{n-1} = 41 + x_{n-1}$$

Similarly the delay before the component due to this pixel, flowing via the reference path in device n, reaches the cascade adder of device n is:

$$D_n = 1 + (x_{n-1} + 1) + (L + 7 + 1) + (L + 7 + 1) + 1 + 1 + (x_n + 1) + 3 + D_r$$
$$D_n = 54 + 2L + x_{n-1} + x_n$$

For a contiguous convolution kernel, however, the results flowing via MAC n must arrive at the cascade adder of device n three line lengths after those which have come from the other route. This may be achieved by ensuring that the data passing via MAC n takes 3L cycles (where L is the line length in pixels) longer than data passing via the MAC n-1 route. Hence:

$$D_n - D_{n-1} = 3L$$
$$54 + 2L + x_{n-1} + x_n - 41 - x_{n-1} = 3L$$
$$x_n = L - 13$$

This means that the PSRC of device n must be programmed with a value which is equal to the line length minus a fixed constant of 13. This rule may be extended to cascades containing any number of devices provided that the maximum length of the delay lines is not exceeded.

N.B. The setting of the PSRC of the first device in the cascade is arbitrary and may be adjusted to deskew the output image.

For example, consider the problem of filtering a 512 pixel wide image with a 7 × 7 filter kernel. This could be achieved by cascading 3 IMS A110s into a vertical cascade. Typical delays which would have to be programmed into the devices are given in Table 4.

Table 4

        Device 1   Device 2   Device 3
PSRA    519        519        519
PSRB    519        519        519
PSRC    0          499        499

[Figure 5: Indirect data path connection for cascading IMS A110s]
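
The vertical cascade rule can be checked numerically in the same way. The sketch below restates the two route delays from this section and builds the register settings for a cascade; the function names and the dictionary layout are illustrative choices, not anything defined by the device.

```python
D_R, D_B = 30, 6          # delay constants from Table 1
MAC_PIPELINE_STAGES = 7   # PSRA and PSRB are set to L + 7


def delay_via_previous(x_prev: int) -> int:
    """Delay to the cascade adder of device n via the MAC of device n-1."""
    return 1 + x_prev + 1 + 3 + D_R + D_B


def delay_via_current(x_prev: int, x_cur: int, line_length: int) -> int:
    """Delay to the cascade adder of device n via the MAC of device n."""
    line_delay = line_length + MAC_PIPELINE_STAGES
    return (1 + (x_prev + 1) + (line_delay + 1) + (line_delay + 1)
            + 1 + 1 + (x_cur + 1) + 3 + D_R)


def vertical_cascade(line_length: int, num_devices: int, first_psrc: int = 0):
    """Register settings for devices 1..num_devices of a vertical cascade."""
    line_delay = line_length + MAC_PIPELINE_STAGES
    psrc = [first_psrc] + [line_length - 13] * (num_devices - 1)
    return [{"PSRA": line_delay, "PSRB": line_delay, "PSRC": x} for x in psrc]


for device in vertical_cascade(512, 3):
    print(device)
# PSRC comes out as 0, 499, 499 with PSRA = PSRB = 519, as in Table 4.
```

For a 512 pixel line, `delay_via_current(0, 499, 512) - delay_via_previous(0)` evaluates to 1536, i.e. three line lengths.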

7. Cascading IMS A110s to produce wider and higher two dimensional filters

To produce filters which are both wider and higher than allowed by a single IMS A110 it is possible to cascade a number of the wider filters discussed in Section 5 into a vertical strip.

The connections required to cascade IMS A110s into two dimensional cascades may be seen in Figure 6. The system shown has arbitrary width but only two rows of devices, allowing a maximum filter height of six cells. However, the system will be examined in a general way so that rules may be developed for cascades of arbitrary height. Across each row, except for the last device, direct connection is used between PSRin and PSRout. The last device uses the indirect route via the programmable shift registers to connect to the first device of the next row.

Since each row of this cascade consists of a horizontal cascade, the rules developed for the delays in such a cascade (see Section 5) apply to each row of this larger configuration. However, the relationship between the delays in the vertical direction requires careful consideration.

Assuming that the array of IMS A110s contains m devices in the horizontal direction and that the delay in PSRC of each device is as shown in Figure 6, it is desired to calculate the relationship between these delays for correct combination of the partial results generated by each row within the cascade.

Consider a pixel when it reaches device n-1,1.
The delay before the first component due to this pixel, flowing via the reference path in device n-1,1, reaches the cascade adder of device n,1 is:

$$D_{n-1,1} = 1 + x_{n-1,1} + 1 + 3 + D_r + m D_b$$
$$D_{n-1,1} = 35 + 6m + x_{n-1,1}$$

[Figure 6: Connections for cascading IMS A110s into wider and higher 2-D filters]

Similarly the delay before the component due to this pixel, flowing via the reference path in device n,1, reaches the cascade adder of device n,1 is:

$$D_{n,1} = 2(m - 1) + 1 + (x_{n-1,m} + 1) + (L + 7 + 1) + (L + 7 + 1) + 1 + 1 + (x_{n,1} + 1) + 3 + D_r$$
$$D_{n,1} = 52 + 2m + 2L + x_{n-1,m} + x_{n,1}$$

For a contiguous convolution kernel, however, the results flowing via MAC n,1 must arrive at the cascade adder of device n,1 a period of 3L clock cycles after those which have come from the other route. Hence:

$$D_{n,1} - D_{n-1,1} = 3L$$
$$52 + 2m + 2L + x_{n-1,m} + x_{n,1} - 35 - 6m - x_{n-1,1} = 3L$$
$$17 - L - 4m + x_{n-1,m} + x_{n,1} = x_{n-1,1}$$

It is also known from Section 5 that any given device in a row except the final device has PSRC programmed to 3 more than the device which follows. This leads to the following relationship between the delays programmed into the first and the last devices of the top row:

$$x_{n-1,1} = x_{n-1,m} + 3(m - 1)$$

Substituting this result into the previous one gives:

$$x_{n,1} = 7m + L - 20$$

This means that the PSRC of device n,1 must be programmed with the value which is equal to 7 times the number of devices cascaded horizontally plus the line length minus a fixed constant of 20. This rule may be extended to cascades containing any number of devices provided that the maximum length of the delay lines is not exceeded.

N.B. The setting of PSRC of the rightmost device in the first row is arbitrary, but is normally adjusted to deskew the output image.

For example, consider the problem of filtering a 512 pixel wide image with a 9 × 9 filter kernel. This could be achieved by cascading six IMS A110s into a cascade containing three rows of two devices. Typical delays which would have to be programmed into the devices are given in Table 5.

8. Cascading IMS A110s to perform multi pass filtering operations

In addition to being able to cascade IMS A110s for increased filter size it is also possible to cascade devices to perform multi pass filtering operations. For example, consider the problem of edge detection in a noisy image. This task is often performed in two stages: the first is low pass filtering to reduce the amount of noise and the second is the edge detection operation. This complete task may be performed by cascading two IMS A110s as shown in Figure 7. Note that only an eight bit window of CASout from the first device is connected to PSRin of the second device.

To configure such a cascade to perform the double filtering operation each device is considered separately and the delays are set up as described in Section 2. For the example under consideration the coefficients of the first device are configured to perform the low pass filter operation while the coefficients of the second device are configured as an edge detector.

This technique of multi pass filtering can be extended to include more devices, or it may be combined with the cascading techniques discussed in earlier sections to allow multi pass filtering with larger filter sizes.

It is also possible to use a single device for multi pass filtering. This technique works by feeding back alternate cascade outputs to PSRin, and making use of bank swapping. Figure 8 shows the basic setup. The disadvantages of this method are:

1) The maximum data throughput is halved.
2) The maximum filter size is reduced.
3) External logic is required.

Setting up such a system requires careful programming to achieve the desired result. For example, consider the problem of passing the local averaging filter kernel shown below over an image twice.

1 1 1
1 1 1
1 1 1

It may be shown, using similar techniques to those presented earlier, that the delay to be programmed into programmable shift registers A and B is:

$$2L + 7$$

This value is equal to twice the line length plus the length of the MAC pipelines.

N.B. Logical reasoning would have led to the same result by considering that the data rate within the device is equal to twice the rate of the applied image data.

Table 5

        Device 1,1   Device 1,2   Device 2,1   Device 2,2   Device 3,1   Device 3,2
PSRA    519          519          519          519          519          519
PSRB    519          519          519          519          519          519
PSRC    3            0            506          503          506          503
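
The PSRC row of Table 5 can be reproduced from the two rules it relies on: neighbouring devices in a row differ by 3 (Section 5), and the first device of every row after the first is set to 7m + L - 20 (Section 7). The following is a sketch under those rules only; `grid_psrc` and its parameters are hypothetical names.

```python
PSR_MAX_LENGTH = 1120  # maximum programmable shift register length


def grid_psrc(line_length: int, rows: int, cols: int, top_right_psrc: int = 0):
    """PSRC for each device of a rows x cols cascade, as table[row][col]."""
    table = []
    for r in range(rows):
        if r == 0:
            # First row: rightmost device is arbitrary, each device to the
            # left is programmed with 3 more than the device that follows it.
            row = [top_right_psrc + 3 * (cols - 1 - c) for c in range(cols)]
        else:
            # Later rows: first device is 7m + L - 20, then 3 less per device.
            first = 7 * cols + line_length - 20
            row = [first - 3 * c for c in range(cols)]
        table.append(row)
    if max(max(row) for row in table) > PSR_MAX_LENGTH:
        raise ValueError("a PSRC setting exceeds the delay line length")
    return table


print(grid_psrc(512, rows=3, cols=2))
# [[3, 0], [506, 503], [506, 503]]
```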

To create the correct filter kernels it is very important that the coefficient registers are programmed correctly. Each filter is programmed into one of the two coefficient banks, and every odd coefficient must be set to zero, otherwise the two interleaved data streams will corrupt each other. The table below shows how the coefficients should be programmed for the example under consideration.

CR0   A   1 0 1 0 1 0 0
      B   1 0 1 0 1 0 0
      C   1 0 1 0 1 0 0
CR1   A   1 0 1 0 1 0 0
      B   1 0 1 0 1 0 0
      C   1 0 1 0 1 0 0

[Figure 7: Cascading IMS A110s for multi-pass filtering]

[Figure 8: Multi-pass filtering by using feedback]

9. Cascading IMS A110s for increased data precision

In some high precision applications the 8 bit word length of a single IMS A110 is not sufficient. This section presents three techniques to overcome this problem. The first two combine IMS A110s with simple external hardware; the last one requires no external hardware but does place certain restrictions on the coefficients and the data.

9.1 Increasing data precision with an external 22 bit adder

The first technique makes use of an external 22 bit adder in the configuration shown in Figure 9. At the input each 16 bit input value is split into two 8 bit words, one containing the least significant 8 bits and the other containing the most significant 8 bits. Each of these 8 bit data streams is fed into an IMS A110. If the data is unsigned then both of the devices must be set to unsigned data operation. However, if the data is signed then, in order to process the data correctly and preserve the sign information, it is necessary for the least significant byte to be processed as unsigned data and the most significant byte to be processed as signed data (see Figure 9). This may be easily achieved by setting or clearing bit 2 of the SCR register in each IMS A110 as appropriate.

The 22 bit partial results from each device are combined by making use of a 22 bit adder. This adder forms the sum of the top 14 bits of the least significant partial result (sign extended to 22 bits if signed data is being used) and the full 22 bits of the most significant partial result to give the upper 22 bits of the final result. This is combined with the lower 8 bits of the least significant partial result to give the complete 30 bit result. See Figure 10 for a graphical representation of this.

Note that for signed data, the least significant partial result must be sign extended to 30 bits as shown in Figure 9. The sign extension is easily achieved by connecting the most significant bit of the least significant partial result to the most significant 8 bits of the adder input.

This technique may be extended to give data precisions above 16 bits; however, such precisions are rarely used in practice. Sometimes it may be desired to combine a bigger filter size, as discussed in earlier sections, with increased precision. Such a system is simple to create and just involves replacing each IMS A110 in Figure 9 with the appropriate cascade of devices. Similarly multi pass filtering, as discussed in Section 8, may be combined with increased precision. This is achieved by selecting a 16 bit window from the output of the system shown in Figure 9 and feeding this into the input of another high precision stage.

[Figure 9: Cascade of IMS A110s for increased data precision (signed data)]

[Figure 10: Calculation of the final output]

9.2 Increasing data precision with an external delay line

As an alternative to using an external adder it is possible to make use of the cascade adder built into each IMS A110 and an external delay line (of length $D_b$) as shown in Figure 11.

The rules discussed earlier in this section about signed data apply equally to this configuration. This means that if signed data is being processed then the left and right hand devices in the diagram have to be configured for unsigned and signed operation respectively. Also, CASin of device n has to be sign extended to 22 bits as described in Section 9.1. The one other consideration when increasing the data precision in this way is the number of delays required in the programmable shift registers of each device.

[Figure 11: Alternative cascade of IMS A110s for increased data precision (unsigned data)]

The settings of PSRA and PSRB are not affected by the presence of another device and are set up as described in Section 2. The setting of PSRC for each device, however, is important, and an incorrect setting will result in erroneous calculation of the most significant 22 bits of the result.

Assuming that the delay in the PSRC of device n-1 is $x_{n-1}$ and that the delay in PSRC of device n is $x_n$, it is desired to calculate the relationship between these delays for correct combination of the partial results.

Consider an item of data when it reaches device n-1. The delay before the component due to this data, flowing via the reference path in device n-1, reaches the cascade adder of device n is:

$$D_{n-1} = 1 + (x_{n-1} + 1) + 3 + D_r + D_b$$
$$D_{n-1} = 41 + x_{n-1}$$

Similarly the delay before the component due to this data, flowing via the reference path in device n, reaches the cascade adder of device n is:

$$D_n = 1 + (x_n + 1) + 3 + D_r$$
$$D_n = 35 + x_n$$

For the data to be correctly aligned at the cascade adder of device n the delay along each path must be the same. Hence:

$$D_{n-1} - D_n = 0$$
$$41 + x_{n-1} - 35 - x_n = 0$$
$$x_n = x_{n-1} + 6$$

This means that the PSRC of device n must be programmed with the value which is in PSRC of device n-1 plus a fixed constant of 6.

This technique of increasing data precision may be extended beyond 16 bits, or may be combined with other cascading techniques to give larger filter sizes etc.

9.3 Increasing data precision with no external hardware

If the data and coefficients are such that only 22 bits or less are required to represent the result then it is possible to increase the data precision with no external hardware.
The connections required are similar to those shown in figure 11. However, the 6 stage delay must be removed and the full 22 bits of casout from the first device must be connected to casin of the second device. To sum the two contributions to the result correctly, the MAC output of the second device must be shifted left by 8 places. This shift is easily performed using the shifter in the second device; however, care must be taken to ensure that overflow does not occur, because any such overflow will not be detected.

10. Cascading IMS A110s for increased coefficient precision

Section 9 described three different techniques for increasing data precision by cascading IMS A110s. In this section three very similar techniques are presented for increasing coefficient precision.

10.1 Increasing coefficient precision with an external 22 bit adder

The first method makes use of an external 22 bit adder as shown in figure 12. At the input each 8 bit value is fed to psrin of both the IMS A110s. The device at the top of the diagram is programmed with the least significant 8 bits of the coefficients and the device at the bottom is programmed with the most significant 8 bits of the coefficients. If the coefficients are unsigned then both devices must be set to unsigned coefficient operation. For signed coefficients, however, the top and bottom devices must be set to unsigned and signed coefficient operation respectively (see figure 12) in order to process the data correctly and preserve the sign information. This is easily achieved by setting or clearing bit 3 of the scr register in each IMS A110 as appropriate. Sign extension must also be performed as described in section 9.1. The 22 bit partial results are then combined in exactly the same fashion as described in section 9.

As discussed for increased data precision, this technique may be extended to more than 16 bits of accuracy if required, or adapted to make use of increased filter sizes, etc. For very high precision systems, increased coefficient and data precision may be combined to give very accurate results.

10.2 Increasing coefficient precision with an external delay line

The second method makes use of a delay line in a very similar configuration to that discussed in the previous section. The setup is shown in figure 13.

[Figure 13: Alternative cascade of IMS A110s for increased coefficient precision (unsigned coefficients)]

The rules about signed coefficients discussed earlier in this section still apply in this configuration. Hence if signed coefficients are required, the left and right hand devices in the diagram must be configured for unsigned and signed coefficient operation respectively, and the sign must be extended appropriately.
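The unsigned/signed split of a signed coefficient can be checked with a short calculation. The Python sketch below is an illustrative model only, not a description of the device internals. It assumes 16 bit two's complement coefficients, ideal integer arithmetic, and a single output sample. The function names are hypothetical.

```python
def split_signed_coeff(coeff16):
    """Split a signed 16 bit coefficient into (unsigned ls byte, signed ms byte)."""
    if not -32768 <= coeff16 <= 32767:
        raise ValueError("coefficient out of signed 16 bit range")
    ls = coeff16 & 0xFF   # 0..255, for the device set to unsigned coefficients
    ms = coeff16 >> 8     # -128..127, arithmetic shift preserves the sign
    return ls, ms


def cascaded_dot(data8, coeffs16):
    """Combine the two partial results as ls_partial + 2**8 * ms_partial."""
    parts = [split_signed_coeff(c) for c in coeffs16]
    # Partial result from the least significant coefficient bytes (unsigned)
    ls_partial = sum(d * ls for d, (ls, _) in zip(data8, parts))
    # Partial result from the most significant coefficient bytes (signed)
    ms_partial = sum(d * ms for d, (_, ms) in zip(data8, parts))
    return ls_partial + ms_partial * 256
```

For example, the coefficient -300 splits into an unsigned least significant byte of 212 and a signed most significant byte of -2, since 212 + (-2 × 256) = -300. Treating the least significant byte as signed would give the wrong value, which is why only the device holding the most significant bytes is set to signed operation.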
[Figure 12: Cascade of IMS A110s for increased coefficient precision (signed coefficients). Labels: IMS A110 (unsigned coeffs), IMS A110 (signed coeffs), sign ext.]

The setting of psrc for each device may be calculated in the same manner as described in the previous section. Performing the calculation gives the following relationship:

$$x_n = x_{n-1} + 4$$
This means that psrc of device n must be programmed with the value in psrc of device n-1 plus a fixed constant of 4. This technique may be extended for more precision, or adapted using information presented in earlier sections to give increased filter size, multi pass filtering, etc.

10.3 Increasing coefficient precision with no external hardware

Section 9 discussed how to achieve increased data precision without using any external hardware. Since exactly the same technique may be applied to give increased coefficient precision, the details are not repeated here.

11. Summary

This document has described some of the many ways in which IMS A110s may be cascaded to yield even higher performance. It has not been possible to discuss every configuration, but the examples should have provided both an insight into the extensive capabilities of these devices when cascaded and some simple rules for setting up the most common forms of cascade.

Information furnished is believed to be accurate and reliable. However, SGS-THOMSON Microelectronics assumes no responsibility for the consequences of use of such information nor for any infringement of patents or other rights of third parties which may result from its use. No licence is granted by implication or otherwise under any patent or patent rights of SGS-THOMSON Microelectronics. Specifications mentioned in this publication are subject to change without notice. This publication supersedes and replaces all information previously supplied. SGS-THOMSON Microelectronics products are not authorized for use as critical components in life support devices or systems without express written approval of SGS-THOMSON Microelectronics.

© 1994 SGS-THOMSON Microelectronics - All rights reserved

Purchase of I²C components of SGS-THOMSON Microelectronics, conveys a license under the Philips I²C patent.
Rights to use these components in a I²C system, is granted provided that the system conforms to the I²C standard specifications as defined by Philips.

SGS-THOMSON Microelectronics group of companies
Australia - Brazil - China - France - Germany - Hong Kong - Italy - Japan - Korea - Malaysia - Malta - Morocco - The Netherlands - Singapore - Spain - Sweden - Switzerland - Taiwan - Thailand - United Kingdom - U.S.A.
